[repo] Contribution guide#2859
Conversation
|
Repository collaborators can run the JMH benchmark suite against this PR by commenting: Optional regression threshold override (Δ% on Time or Alloc/op; defaults to 10%): Only one benchmark run per PR is active at a time — issuing a new |
| - what tests will prove the behavior | ||
| - whether documentation, `CHANGELOG.md`, or `docs/features.md` needs to change | ||
|
|
||
| It is fine to use AI tools to draft an implementation proposal, explore design options, or prepare a first implementation. AI-generated code is welcome when the specification is detailed, the resulting code follows project patterns, and the tests clearly validate the expected behavior. Special attention should be paid to tests because they are the main guardrails for code changes. |
There was a problem hiding this comment.
AI-generated code is welcome when
should we add AI_POLICY.md and link it here?
There was a problem hiding this comment.
yes. Will add link.
|
|
||
| Contributions are always welcome: issues, pull requests, questions, ideas, tests, documentation, and examples all help the project. | ||
|
|
||
| ClickHouse Java is maintained by a team responsible for the stability and quality of released artifacts. Because this repository publishes client libraries and drivers used by applications, every contribution needs enough context and validation for maintainers to review its safety. At minimum, a code change should include a clear description and tests that prove the intended behavior. |
There was a problem hiding this comment.
every contribution needs enough context and validation for maintainers to review its safety
I'd suggest rephrase to something > here's what makes them easy to review, otherwise, it reads too defensively
|
|
||
| Contributions are always welcome: issues, pull requests, questions, ideas, tests, documentation, and examples all help the project. | ||
|
|
||
| ClickHouse Java is maintained by a team responsible for the stability and quality of released artifacts. Because this repository publishes client libraries and drivers used by applications, every contribution needs enough context and validation for maintainers to review its safety. At minimum, a code change should include a clear description and tests that prove the intended behavior. |
There was a problem hiding this comment.
Because this repository publishes client libraries and drivers used by applications, every contribution needs enough context and validation for maintainers to review its safety.
I'd even strengthen it
The libraries are foundational infrastructure for the software built on top of them, thus a regression or breaking change in a client propagates to applications that depends on it, often with limited ability for users to work around the issue
There was a problem hiding this comment.
Maybe
ClilckHouse client libraries are deployed in production environments all over the world. Our users and customers depend on them being thoughtfully designed, well documented, and highly reliable. The high bar for quality we apply to all our software must apply to external contributions as well.
There was a problem hiding this comment.
What about:
Libraries in this repository are used in many production systems. Users expect certain level of quality and
active support (adding new features, fixing issues). It is important to make all changes safe and swiftly.
There we ask all contributors for tests and clear description of changes done in a PR. This expedites merge and quadruples value of your contribution.
|
|
||
| ### Big Contributions | ||
|
|
||
| Large changes are easier to review and merge when they are split into a few smaller pull requests. From our experience, it is often better to prepare the codebase first with focused cleanup, refactoring, or test-only PRs, and then add the feature or behavior change in a later PR. |
There was a problem hiding this comment.
The principle is right but the guidance doesn't have an easy to follow definition. Let's give contributors a concrete threshold. For example (numbers are taken from this study https://www.propelcode.ai/blog/pr-size-impact-code-review-quality-data-study):
As a rough guideline, if a pull request is over 300–400 lines of code (excluding docs and tests), consider whether it can be split. Smaller PRs are reviewed and merged faster, and the savings in review time usually outweigh the overhead of splitting
There was a problem hiding this comment.
We all know that large PRs (400+ lines of code) are hard to review and it is more likely to miss a bug. Small changes are appreciated and take less time to verify them. If many changes are needed anyway (feature implementation, refactoring), please split them: something can be done as code preparation, cleanup or smaller improvement. It should take less time at sum as this repository is actively maintained. We will ask to split changes of 800+ LOC anyway.
| - a link to the related issue, when one exists | ||
| - an explanation of the user-visible behavior and compatibility impact | ||
| - tests for the changed behavior, including integration tests when the change interacts with ClickHouse server behavior | ||
| - the local test commands that were run |
There was a problem hiding this comment.
not sure we need that level of noize in every PR. I'd just set expectations: you run test before opening a PR
There was a problem hiding this comment.
This kind of falls under the stale-PR point above. If the user is clearly not running tests before submitting, CI will be consistently failing and that's a fair reason for us to close the PR.
There was a problem hiding this comment.
Ok. I will rephrase
- a link to the related issue
- compatibility impact and behavior changes
- unit and integration test
Make sure all existing and new tests are passing. PRs with failing CI will stay open.
|
|
||
| Changes to CI workflows are currently restricted. Please discuss CI workflow changes with maintainers before opening a pull request. | ||
|
|
||
| ## Pull Request Checklist |
There was a problem hiding this comment.
I'd add a couple of things to set clearer expectations on both sides:
-
CI must be runnable on the PR. For external contributors, the initial CI run requires approval from a maintainer (as of today, probably should be revisited). Let’s explicitly mention this and commit to triggering it promptly to avoid contributors being stuck waiting. For internal PRs, the CI suite should actually run against the proposed change before the review process begins. If tests are scoped down, skipped, or disabled, the PR should clearly indicate this.
-
Reviews occur when CI is green. Clearly state that the team will review the code once CI passes. Additionally, contributors should either fix PRs with failing CI or move them to the draft. This approach is fair in both directions: contributors have a clear understanding of what means
ready for review, and reviewers avoid spending cycles on changes that are not ready
There was a problem hiding this comment.
I would also mention:
- Out-of-date branches and merge conflicts should be resolved promptly (GitHub should send an email notifying when this happens)
- PRs that remain open for extended periods with unresolved conversations, merge conflicts or failing CI may be closed at maintainer discretion
- I learned the hard way that a PR author cannot re-open a PR that they did not close themselves. This should mention that if we close a PR for being stale and they want to come back to it, they will need to open a new PR.
| @@ -1,125 +1,194 @@ | |||
| ## Getting started | |||
| ClickHouse-JDBC client is an open-source project, and we welcome any contributions from the community. Please share your ideas, contribute to the codebase, and help us maintain up-to-date documentation. | |||
| # Contributing to ClickHouse Java | |||
There was a problem hiding this comment.
Most of the guidance provided in the document (Testing, Contribution Process, significant contributions, and the PR checklist) is also relevant to AI coding agents. Considering the significant portion of our contributions made by agents, we should either mirror these rules into AGENTS.md or extract them into a shared file that both documents reference.
This can be done in a separate PR.
There was a problem hiding this comment.
Can the AGENTS.md just say "please also read CONTRIBUTING.md"?
There was a problem hiding this comment.
@mshustov
I agree. Will write something.
@abonander
This might be a good idea.
|
|
||
| Some good ideas need time to design and implement. We are open to discussion and welcome collaboration in issues, even when the final implementation is not ready yet. | ||
|
|
||
| ### Big Contributions |
There was a problem hiding this comment.
One logical change per PR: a feature, a bug fix, a refactor, or a doc update. Mixed-purpose PRs complicate reviews and increase the likelihood of unintended regressions.
|
|
||
| Large changes are easier to review and merge when they are split into a few smaller pull requests. From our experience, it is often better to prepare the codebase first with focused cleanup, refactoring, or test-only PRs, and then add the feature or behavior change in a later PR. | ||
|
|
||
| Avoid packing unrelated cleanup, infrastructure changes, test rewrites, and the feature itself into one large pull request. Smaller PRs help maintainers verify intent, reduce review time, and lower the risk of regressions. |
There was a problem hiding this comment.
For a new feature, implement the smallest change that solves the problem. Polish, additional configuration, optimization, and extensions can be addressed in the follow-up PRs
| - a link to the related issue, when one exists | ||
| - an explanation of the user-visible behavior and compatibility impact | ||
| - tests for the changed behavior, including integration tests when the change interacts with ClickHouse server behavior | ||
| - the local test commands that were run |
There was a problem hiding this comment.
- run the project's code review skill against your diff before requesting human review
this and run tests locally can be moved to before you open a PR
|
|
||
| Every pull request should include: | ||
|
|
||
| - a clear description of what changed and why |
There was a problem hiding this comment.
| - a clear description of what changed and why | |
| - a clear description of what changed, why and why now |
|
I really like where this is going. The only thing I'm weary of is that the longer it gets, the less likely a human is going to read the entire thing. An agent will no problem, so the length with detailed justifications and directions is great, but it might be worth having a short bulleted TLDR section at the top that provides md links to the relevant sections? e.g: ⚡ TLDR
- Proposals: Small fix? Just open a PR. Big feature or API change? Open an [Issue](#issues-discussion-and-proposals) first.
- PR Requirements: Every PR needs a clear description, tests, and a CHANGELOG.md update. See the full [PR Checklist](#pull-request-checklist).
- Testing: Code changes require both positive and negative tests. Keep them focused. More in [Testing Expectations](#testing-expectations).
- Quick Build Command (JDK 17+): `mvn -DskipITs` clean verify (Requires [Toolchains setup](#set-up-the-environment)).
- Run Local Tests: Unit tests run automatically. For Integration Tests, ensure Docker is running. Learn how to [Run Unit and Integration Tests](#run-unit-and-integration-tests). |
|
thank you for the feedback! Document is mainly focused for people not agents. I will add suggested section at header and will try to minimize wording to make document shorter and may be bit more structured. |
Summary
Reworked contribution guide to tell more about process and expectations.
We should value every ones time and reduce review cycles so we explain how to prepare PR for best result.